CHARACTERIZING INTEGERS AMONG RATIONAL NUMBERS WITH 
A UNIVERSAL-EXISTENTIAL FORMULA 



BJORN POONEN 

Abstract. We prove that Z is definable in Q by a formula with 2 universal quantifiers 
followed by 7 existential quantifiers. It follows that there is no algorithm for deciding, given 
an algebraic family of Q-morphisms, whether there exists one that is surjective on rational 
points. We also give a formula, again with universal quantifiers followed by existential 
quantifiers, that in any number field defines the ring of integers. 



1. Introduction 

1.1. Background. D. Hilbert, in the 10th of his famous list of 23 problems, asked for 
an algorithm for deciding the solvability of any multivariable polynomial equation in integers. 
Thanks to the work of M. Davis, H. Putnam, J. Robinson [DPR61], and Y. Matijasevič [Mat70], 
we know that no such algorithm exists. In other words, the positive existential 
theory of the integer ring Z is undecidable. 

It is not known whether there exists an algorithm for the analogous problem with Z 
replaced by the field Q of rational numbers. But Robinson showed that the full first-order 
theory of Q is undecidable: she reduced the problem to the corresponding known result for 
Z by showing that Z could be defined in Q by a first-order formula [Rob49, Theorem 3.1]. 
If there were a positive existential formula defining Z in Q, then an easy reduction from Q 
to Z would show that Hilbert's 10th problem over Q would have a negative answer. 

G. Cornelissen and K. Zahidi [CZ06] ask: 

(1) What is the smallest part of the first-order theory of Q that can be proved undecid- 
able? 

(2) How complicated must a formula defining Z in Q be? 

To make these questions precise, they define the positive arithmetical hierarchy as follows: 
$\Sigma_0^+ = \Pi_0^+$ is the set of atomic formulas (which, in the language of rings, are polynomial 
equations), and for $n \in \mathbb{Z}_{\ge 0}$, inductively define $\Sigma_{n+1}^+$ as the set of formulas consisting of 
any number of existential quantifiers followed by a formula in $\Pi_n^+$, and $\Pi_{n+1}^+$ as the set of 
formulas consisting of any number of universal quantifiers followed by a formula in $\Sigma_n^+$. Thus, 
for instance, positive existential formulas are equivalent to those in $\Sigma_1^+$, and the formula 

$$(\forall x_1 \forall x_2 \exists y \forall z_1 \forall z_2) \quad x_1^2 w + x_2^3 y - x_2 z_1 = x_1 z_2 + w^7$$

is a $\Pi_3^+$-formula with one free variable, w. 
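
The level of a prenex formula in this hierarchy depends only on how its quantifier prefix breaks into alternating blocks. The following is a minimal sketch of that bookkeeping, assuming the matrix is a single polynomial equation; the function name `classify_prefix` and the encoding of quantifiers as the letters `A` and `E` are illustrative choices, not notation from [CZ06].

```python
from itertools import groupby


def classify_prefix(prefix: str) -> str:
    """Return the lowest level of the positive hierarchy that contains a
    prenex formula with quantifier prefix `prefix` ('A' = for all,
    'E' = there exists) whose matrix is one polynomial equation."""
    if any(ch not in "AE" for ch in prefix):
        raise ValueError("prefix must consist of 'A' and 'E' only")
    # Consecutive quantifiers of the same kind form a single block.
    blocks = [kind for kind, _ in groupby(prefix)]
    if not blocks:
        return "Sigma_0^+ = Pi_0^+"
    family = "Pi" if blocks[0] == "A" else "Sigma"
    return f"{family}_{len(blocks)}^+"
```

For instance, the prefix of the displayed example is `"AAEAA"`, which has three blocks starting with a universal one, and a prefix of 2 universal quantifiers followed by 7 existential ones has two blocks.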

As remarked in [CZ06], Robinson's definition of Z in Q uses a $\Pi_4^+$-formula, and it follows 
that the $\Sigma_5^+$-theory of Q is undecidable. Theorems 4.2 and 5.3 of [CZ06] show that a 
conjecture about elliptic curves implies that Z is definable in Q by a $\Pi_2^+$-formula, and that 
the $\Pi_2^+$-theory of Q is undecidable, even if one allows only formulas with a single universal 
quantifier. 

1.2. Our results. We prove unconditionally that Z is definable in Q by a $\Pi_2^+$-formula. 
Combining this with the negative answer to Hilbert's tenth problem shows that the 
$\Sigma_3^+$-theory of Q is undecidable. Our proof uses not elliptic curves, but quaternion algebras. 

These results may be restated in geometric terms. By Q-variety we mean a separated 
scheme of finite type over Q. Given a Q-morphism $\pi\colon V \to T$ and $t \in T(\mathbb{Q})$, let $V_t = \pi^{-1}(t)$ 
be the fiber. Then 

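(a) There exists a diagram of Q-varieties 

$$\begin{array}{ccc} V & \longrightarrow & W \\ & \searrow \quad \swarrow & \\ & \mathbb{A}^1 & \end{array}$$

such that Z equals the set of $t \in \mathbb{Q} = \mathbb{A}^1(\mathbb{Q})$ such that $V_t(\mathbb{Q}) \to W_t(\mathbb{Q})$ is surjective. 
(In fact, we may take $W = \mathbb{A}^3$, and take its map to $\mathbb{A}^1$ to be a coordinate projection.) 

(b) There is no algorithm that takes as input a diagram of Q-varieties 

$$\begin{array}{ccc} V & \longrightarrow & W \\ & \searrow \quad \swarrow & \\ & T & \end{array}$$

and decides whether or not there exists $t \in T(\mathbb{Q})$ such that $V_t(\mathbb{Q}) \to W_t(\mathbb{Q})$ is 
surjective. 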

In the final section of the paper, we generalize to number fields: we find a $\Pi_2^+$-formula 
that in every number field $k$ defines its ring of integers $\mathcal{O}_k$. 



2. Quaternion algebras 

We use a quaternion algebra argument similar to that in the proof of [Eis05, Theorem 3.1]. 
Let $\mathcal{P} = \{2, 3, 5, \ldots\}$ be the set of prime numbers. Given $a, b \in \mathbb{Q}^\times$, let $H_{a,b}$ be the quaternion 
algebra over Q generated by $i$ and $j$ satisfying $i^2 = a$, $j^2 = b$, and $ij = -ji$. Let $\Delta_{a,b}$ be the 
set of $p \in \mathcal{P}$ that ramify in $H_{a,b}$. Let $S_{a,b}$ be the set of reduced traces of elements of $H_{a,b}$ of 
reduced norm 1. For $p \in \mathcal{P}$, define $S_{a,b}(\mathbb{Q}_p)$ similarly for $H_{a,b} \otimes \mathbb{Q}_p$. For any prime power $q$, 
let $U_q$ be the set of $s \in \mathbb{F}_q$ such that $x^2 - sx + 1$ is irreducible in $\mathbb{F}_q[x]$. Let $\operatorname{red}_p\colon \mathbb{Z}_p \to \mathbb{F}_p$ 
be the reduction map. 
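
To make the definitions of $H_{a,b}$ and $S_{a,b}$ concrete, here is a small sketch of arithmetic in $H_{a,b}$ over Q. The class name `Quaternion` and its method names are illustrative assumptions; the multiplication table is derived only from the relations $i^2 = a$, $j^2 = b$, $ij = -ji$ stated above, and the reduced trace and reduced norm of $x_1 + x_2 i + x_3 j + x_4 ij$ are taken to be $2x_1$ and $x_1^2 - a x_2^2 - b x_3^2 + ab x_4^2$.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Quaternion:
    """x1 + x2*i + x3*j + x4*ij in H_{a,b}, where i^2 = a, j^2 = b, ij = -ji."""
    a: Fraction
    b: Fraction
    x1: Fraction
    x2: Fraction
    x3: Fraction
    x4: Fraction

    def __mul__(self, other: "Quaternion") -> "Quaternion":
        if (self.a, self.b) != (other.a, other.b):
            raise ValueError("elements live in different algebras")
        a, b = self.a, self.b
        x1, x2, x3, x4 = self.x1, self.x2, self.x3, self.x4
        y1, y2, y3, y4 = other.x1, other.x2, other.x3, other.x4
        # With k = ij: k^2 = -ab, ik = a*j, ki = -a*j, jk = -b*i, kj = b*i.
        return Quaternion(
            a, b,
            x1 * y1 + a * x2 * y2 + b * x3 * y3 - a * b * x4 * y4,
            x1 * y2 + x2 * y1 - b * x3 * y4 + b * x4 * y3,
            x1 * y3 + x3 * y1 + a * x2 * y4 - a * x4 * y2,
            x1 * y4 + x4 * y1 + x2 * y3 - x3 * y2,
        )

    def reduced_trace(self) -> Fraction:
        return 2 * self.x1

    def reduced_norm(self) -> Fraction:
        a, b = self.a, self.b
        return self.x1**2 - a * self.x2**2 - b * self.x3**2 + a * b * self.x4**2


def has_norm_one(q: Quaternion) -> bool:
    """True when q has reduced norm 1, so its reduced trace lies in S_{a,b}."""
    return q.reduced_norm() == 1
```

As a usage example, in $H_{2,3}$ the element $3 + 2i$ has reduced norm $9 - 2 \cdot 4 = 1$ and reduced trace 6, so $6 \in S_{2,3}$.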

Lemma 2.1. 

(i) If $p \notin \Delta_{a,b}$, then $S_{a,b}(\mathbb{Q}_p) = \mathbb{Q}_p$. 

(ii) If $p \in \Delta_{a,b}$, then $\operatorname{red}_p^{-1}(U_p) \subseteq S_{a,b}(\mathbb{Q}_p) \subseteq \mathbb{Z}_p$. 

Proof. We have $s \in S_{a,b}(\mathbb{Q}_p)$ if and only if $x^2 - sx + 1$ is the reduced characteristic polynomial 
of an element of $H_{a,b} \otimes \mathbb{Q}_p$. 

(i) If $p \notin \Delta_{a,b}$, then $H_{a,b} \otimes \mathbb{Q}_p \simeq M_2(\mathbb{Q}_p)$, and any monic quadratic polynomial is a 
characteristic polynomial. 

(ii) Now suppose that $p \in \Delta_{a,b}$. Then $H_{a,b} \otimes \mathbb{Q}_p$ is the ramified quaternion algebra over 
$\mathbb{Q}_p$, and $x^2 - sx + 1$ is a reduced characteristic polynomial if and only if it is a power of a 
monic irreducible polynomial in $\mathbb{Q}_p[x]$. If $\operatorname{red}_p(s) \in U_p$, then $x^2 - sx + 1$ is irreducible over 
$\mathbb{Q}_p$. If $s \in \mathbb{Q}_p - \mathbb{Z}_p$, then $x^2 - sx + 1$ is a product of two distinct factors, by the theory of 
Newton polygons. □ 

Lemma 2.2. If $a, b \in \mathbb{Q}^\times$ and either $a > 0$ or $b > 0$, then $S_{a,b} = \mathbb{Q} \cap \bigcap_p S_{a,b}(\mathbb{Q}_p)$. 

Proof. This is a special case of the Hasse principle for rational numbers represented by 
quadratic forms: see [Ser73, p. 43, Corollary 1], for example. □ 

Lemma 2.3. For any prime power $q$, the set $U_q$ is nonempty. If $q > 11$ then $U_q + U_q = \mathbb{F}_q$. 

Proof. We have $U_q = \operatorname{Tr}(\{\beta \in \mathbb{F}_{q^2} - \mathbb{F}_q : N(\beta) = 1\})$, where Tr and N are the trace and 
norm for $\mathbb{F}_{q^2}/\mathbb{F}_q$. Since $\mathbb{F}_{q^2}$ contains $q + 1$ norm-1 elements, $U_q \ne \emptyset$. Also, $-U_q = U_q$, so 
$0 \in U_q + U_q$. Given $a \in \mathbb{F}_q^\times$ with $q > 11$, we hope to prove $a \in U_q + U_q$. 

Suppose that $q$ is odd. Write $\mathbb{F}_{q^2} = \mathbb{F}_q(\sqrt{c})$ with $c \in \mathbb{F}_q^\times - \mathbb{F}_q^{\times 2}$. Then 
$U_q = \{2x : x, y \in \mathbb{F}_q,\ y \ne 0, \text{ and } x^2 - cy^2 = 1\}$. So $a \in U_q + U_q$ if and only if there exist 
$x_1, y_1, x_2, y_2 \in \mathbb{F}_q$ satisfying 

$$x_1^2 - c y_1^2 = 1, \quad x_2^2 - c y_2^2 = 1, \quad 2x_1 + 2x_2 = a, \quad y_1, y_2 \ne 0.$$

These conditions define a smooth curve $X$ in $\mathbb{A}^4$. Eliminating $x_2$ shows that the projective 
closure $\overline{X}$ of $X$ is a geometrically integral intersection of two quadrics in $\mathbb{P}^3$, with function 
field $\mathbb{F}_q(x_1)\bigl(\sqrt{c(1 - x_1^2)}, \sqrt{c(1 - (a/2 - x_1)^2)}\bigr)$. So $X$ is of genus 1 with at most 12 punctures 
(the intersections of $\overline{X}$ with three hyperplanes: $y_1 = 0$, $y_2 = 0$, and the one at infinity). 

If instead $q$ is even, $\mathbb{F}_{q^2} = \mathbb{F}_q(\gamma)$ where $\gamma^2 + \gamma + c = 0$ for some $c \in \mathbb{F}_q$, and we seek an 
$\mathbb{F}_q$-point on the curve $X$ defined by 

$$x_1^2 + x_1 y_1 + c y_1^2 = 1, \quad x_2^2 + x_2 y_2 + c y_2^2 = 1, \quad y_1 + y_2 = a, \quad y_1, y_2 \ne 0.$$

The geometric properties of $X$ are the same as in the odd $q$ case. 
For any $q > 23$, the Hasse bound yields 

$$\#X(\mathbb{F}_q) \ge (q + 1 - 2\sqrt{q}) - 12 > 0,$$

so $a \in U_q + U_q$. If $11 < q \le 23$ we check $U_q + U_q = \mathbb{F}_q$ by exhaustion. □ 
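
The exhaustion step at the end of the proof can be sketched as follows for prime $q$ only. The function names are illustrative, and the prime power $q = 16$, which also lies in the range $11 < q \le 23$, would need genuine arithmetic in $\mathbb{F}_{16}$ and is not covered by this sketch. A quadratic is irreducible over $\mathbb{F}_p$ exactly when it has no root there, which is the test used below.

```python
def U(p: int) -> set[int]:
    """Residues s mod p for which x^2 - s*x + 1 has no root mod p,
    i.e. is irreducible over F_p (p is assumed prime)."""
    return {s for s in range(p)
            if all((x * x - s * x + 1) % p != 0 for x in range(p))}


def sumset_is_everything(p: int) -> bool:
    """Check whether U_p + U_p equals all of F_p."""
    u_p = U(p)
    return {(s + t) % p for s in u_p for t in u_p} == set(range(p))


# Primes in the range left to exhaustion in the proof of Lemma 2.3.
checked = {p: sumset_is_everything(p) for p in (13, 17, 19, 23)}
```

Running the same check on the primes $p \le 11$ shows that the sumset falls short of $\mathbb{F}_p$ there, which is consistent with the hypothesis $q > 11$ in the lemma.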

Let $N = 2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 = 2310$. Let $T_{a,b}$ be the set of rational numbers of the form $s + s' + n$ 
where $s, s' \in S_{a,b}$ and $n \in \{0, 1, 2, \ldots, N - 1\}$. 

Lemma 2.4. If $a, b \in \mathbb{Q}^\times$ and either $a > 0$ or $b > 0$, then $T_{a,b} = \bigcap_{p \in \Delta_{a,b}} \mathbb{Z}_{(p)}$. 

Proof. Let $T'_{a,b}$ be the right-hand side. Lemmas 2.1 and 2.2 imply $S_{a,b} \subseteq T'_{a,b}$, so $T_{a,b} \subseteq T'_{a,b}$. 

Now suppose $t \in T'_{a,b}$. Choose $n \in \{0, 1, 2, \ldots, N - 1\}$ such that $\operatorname{red}_p(t - n) \in U_p + U_p$ for 
all $p \le 11$. For each $p > 11$, Lemma 2.3 yields $\operatorname{red}_p(t - n) \in U_p + U_p$. So we may choose 
$s \in \mathbb{Z}$ such that $\operatorname{red}_p(s), \operatorname{red}_p(t - n - s) \in U_p$ for all $p \in \Delta_{a,b}$. Now $s, t - n - s \in S_{a,b}$ by 
Lemmas 2.1 and 2.2. So $t \in T_{a,b}$. □ 
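
The choice of $n$ and $s$ in this proof is effective. The sketch below carries it out under simplifying assumptions that are not in the text: $t$ is taken to be an ordinary integer, the ramified primes are passed in as an explicit list, and $n$ is chosen as $t \bmod N$, which works because $0 \in U_p + U_p$ for every $p$. All function names are illustrative.

```python
N = 2 * 3 * 5 * 7 * 11  # 2310


def U(p: int) -> set[int]:
    """Residues s mod p such that x^2 - s*x + 1 has no root mod p."""
    return {s for s in range(p)
            if all((x * x - s * x + 1) % p != 0 for x in range(p))}


def split_residue(r: int, p: int) -> int:
    """Return some u in U_p such that (r - u) mod p also lies in U_p."""
    u_p = U(p)
    for u in sorted(u_p):
        if (r - u) % p in u_p:
            return u
    raise ValueError(f"{r} mod {p} is not in U_p + U_p")


def choose_n_and_s(t: int, ramified: list[int]) -> tuple[int, int]:
    """Given an integer t and a list of distinct primes, return (n, s) with
    0 <= n < N such that s mod p and (t - n - s) mod p both lie in U_p
    for every listed prime p."""
    n = t % N  # t - n is 0 mod p for each p <= 11, and 0 is in U_p + U_p
    s, modulus = 0, 1
    for p in ramified:
        u = split_residue((t - n) % p, p)
        # Chinese remainder step: move s into the class u mod p without
        # changing its classes modulo the primes already handled.
        k = ((u - s) * pow(modulus, -1, p)) % p
        s += k * modulus
        modulus *= p
    return n, s
```

For example, `choose_n_and_s(12345, [2, 3, 13, 29])` returns a pair for which both residues land in $U_p$ at each of the four primes. The modular inverse via `pow(modulus, -1, p)` requires Python 3.8 or later.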

Remark 2.5. It follows that the set of $(a, b, c) \in \mathbb{Q}^\times \times \mathbb{Q}^\times \times \mathbb{Q}$ such that at least one of $a$ and 
$b$ is positive and such that $c$ is integral at all primes ramifying in $H_{a,b}$ is diophantine over Q. 
This adds to the toolbox that might someday be useful for a negative answer to Hilbert's 
Tenth Problem over Q. Given a prime $p$, it is possible to choose $a, b, a', b' \in \mathbb{Q}_{>0}$ with 
$\Delta_{a,b} \cap \Delta_{a',b'} = \{p\}$, so that $T_{a,b} + T_{a',b'} = \mathbb{Z}_{(p)}$; thus we also quickly recover the well-known 
fact that $\mathbb{Z}_{(p)}$ is diophantine over Q. 

Lemma 2.6. We have $\bigcap_{a,b \in \mathbb{Q}_{>0}} T_{a,b} = \mathbb{Z}$. 

Proof. By Lemma 2.4, it suffices to show that for each $p \in \mathcal{P}$ there exist $a, b \in \mathbb{Q}_{>0}$ such 
that $H_{a,b}$ is ramified at $p$. If $p = 2$, take $a = b = 7$. If $p > 2$, take $a = p$ and choose $b \in \mathbb{Z}_{>0}$ 
with $\operatorname{red}_p(b) \in \mathbb{F}_p^\times - \mathbb{F}_p^{\times 2}$. □ 

3. Definition of Z 

Theorem 3.1. The set Z equals the set of $t \in \mathbb{Q}$ for which the following $\Pi_2^+$-formula is true 
over Q: 

$$(\forall a, b)(\exists a_1, a_2, a_3, a_4, b_1, b_2, b_3, b_4, x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4, n)$$

$$\begin{aligned}
&(a + a_1^2 + a_2^2 + a_3^2 + a_4^2)(b + b_1^2 + b_2^2 + b_3^2 + b_4^2) \\
&\quad \cdot \bigl[(x_1^2 - a x_2^2 - b x_3^2 + ab x_4^2 - 1)^2 + (y_1^2 - a y_2^2 - b y_3^2 + ab y_4^2 - 1)^2 \\
&\qquad + n^2 (n - 1)^2 \cdots (n - 2309)^2 + (2x_1 + 2y_1 + n - t)^2\bigr] = 0.
\end{aligned}$$

Proof. The $a$ for which there exist $a_1, \ldots, a_4$ such that $a + a_1^2 + a_2^2 + a_3^2 + a_4^2 = 0$ 
are exactly those satisfying $a \le 0$. Thus removing this factor and the corresponding factor for $b$ is 
equivalent to restricting the domain of $a, b$ to $\mathbb{Q}_{>0}$. Now the theorem follows directly from 
Lemma 2.6. □ 
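
The first step of this proof rests on the fact that every nonnegative rational number is a sum of four rational squares. A minimal sketch of how witnesses $a_1, \ldots, a_4$ can be produced, assuming Lagrange's four-square theorem for integers and using a brute-force search that is only practical for small inputs (the function names are illustrative):

```python
from fractions import Fraction
from itertools import product
from math import isqrt


def four_squares(k: int) -> tuple[int, int, int, int]:
    """Brute-force search for k = w^2 + x^2 + y^2 + z^2 with k >= 0."""
    r = isqrt(k)
    for w, x, y in product(range(r + 1), repeat=3):
        rest = k - w * w - x * x - y * y
        if rest >= 0 and isqrt(rest) ** 2 == rest:
            return (w, x, y, isqrt(rest))
    raise AssertionError("unreachable by Lagrange's four-square theorem")


def witnesses(a: Fraction) -> tuple[Fraction, ...]:
    """For rational a <= 0, return (a1, a2, a3, a4) with
    a + a1^2 + a2^2 + a3^2 + a4^2 = 0."""
    if a > 0:
        raise ValueError("no rational witnesses exist when a > 0")
    m, n = (-a).numerator, (-a).denominator
    # -a = m/n = (m*n)/n^2, and m*n is a sum of four integer squares.
    return tuple(Fraction(s, n) for s in four_squares(m * n))
```

For $a = -7/3$ this writes $21 = 7 \cdot 3$ as a sum of four integer squares and divides each root by 3. For $a > 0$ the left side $a + a_1^2 + \cdots + a_4^2$ is strictly positive, so no witnesses exist and the first factor of the formula cannot vanish.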

4. Reducing the number of quantifiers 

The formula in Theorem 3.1 contains 2 universal quantifiers followed by 17 existential 
quantifiers. We do not see how to reduce the number of universal quantifiers. But we can 
reduce the number of existential quantifiers: 

Theorem 4.1. It is possible to define Z in Q with a $\Pi_2^+$-formula with 2 universal quantifiers 
followed by 7 existential quantifiers. 

The proof of Theorem 4.1 requires the following: 

Lemma 4.2. We have $\bigcap_{a,b \in \mathbb{Q}} T_{a^2+b^2+1,\, a^2+a+1+b^2} = \mathbb{Z}$. 



Proof. Since $a^2 + b^2 + 1$ and $a^2 + a + 1 + b^2$ are always positive, Lemma 2.6 shows that the left 
hand side contains Z. For the opposite inclusion, by Lemma 2.4 we must show that for every 
$p \in \mathcal{P}$ there exist $a, b \in \mathbb{Q}$ such that $H_{a^2+b^2+1,\, a^2+a+1+b^2}$ is ramified at $p$. For given $a, b, p$ the 
ramification may be tested by computing a Hilbert symbol as in [Ser73, p. 20, Theorem 1], 
for example. 

If $p = 2$ or $p = 3$, take $a = -1$ and $b = 1$. If $p = 5$ or $p = 7$, take $a = 2$ and $b = 0$. 
Suppose $p \ge 11$. Choose $c \in \mathbb{F}_p^\times - \mathbb{F}_p^{\times 2}$. The affine curve $X$ defined by $c^2 x^4 + y^2 + 1 = 0$ 
and $x \ne 0$ in $\mathbb{A}^2_{\mathbb{F}_p}$ is a genus-1 curve with 4 punctures, so 

$$\#X(\mathbb{F}_p) \ge (p + 1 - 2\sqrt{p}) - 4 > 0.$$

Choose $(x_0, y_0) \in X(\mathbb{F}_p)$. Choose $a, b \in \mathbb{Z}$ with $\operatorname{red}_p(a) = c x_0^2$ and $\operatorname{red}_p(b) = y_0$. Then 

$$\operatorname{red}_p(a^2 + b^2 + 1) = c^2 x_0^4 + y_0^2 + 1 = 0.$$

By adding $p$ to $a$ if necessary, we may assume in addition that $a^2 + b^2 + 1 \not\equiv 0 \pmod{p^2}$. 
Also, $a^2 + a + 1 + b^2 \equiv a \pmod{p}$, and $a$ is not a square modulo $p$. Thus $H_{a^2+b^2+1,\, a^2+a+1+b^2}$ 
is ramified at $p$. □ 
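
The small-prime choices in this proof, and the choice $a = b = 7$ at $p = 2$ in Lemma 2.6, can be checked with a Hilbert symbol computation. The sketch below uses the standard explicit formula for $(a, b)_p$ in terms of Legendre symbols and the residues of the unit parts modulo 4 and 8, restricted to nonzero integer arguments; the function names are illustrative. The algebra $H_{a,b}$ is ramified at $p$ exactly when $(a, b)_p = -1$.

```python
def legendre(u: int, p: int) -> int:
    """Legendre symbol (u/p) for an odd prime p not dividing u."""
    return 1 if pow(u, (p - 1) // 2, p) == 1 else -1


def split_off(n: int, p: int) -> tuple[int, int]:
    """Write a nonzero integer n as p^alpha * u with p not dividing u."""
    alpha = 0
    while n % p == 0:
        n //= p
        alpha += 1
    return alpha, n


def hilbert_symbol(a: int, b: int, p: int) -> int:
    """Hilbert symbol (a, b)_p for nonzero integers a, b and a prime p."""
    alpha, u = split_off(a, p)
    beta, v = split_off(b, p)
    if p == 2:
        eps = lambda z: ((z - 1) // 2) % 2        # (z - 1)/2 mod 2
        omega = lambda z: ((z * z - 1) // 8) % 2  # (z^2 - 1)/8 mod 2
        e = eps(u) * eps(v) + alpha * omega(v) + beta * omega(u)
        return -1 if e % 2 else 1
    sign = -1 if (alpha * beta * ((p - 1) // 2)) % 2 else 1
    return sign * legendre(u, p) ** beta * legendre(v, p) ** alpha


def ramified(a: int, b: int, p: int) -> bool:
    """True when the quaternion algebra H_{a,b} is ramified at p."""
    return hilbert_symbol(a, b, p) == -1
```

With $a = -1$, $b = 1$ the two parameters are $a^2 + b^2 + 1 = 3$ and $a^2 + a + 1 + b^2 = 2$, and with $a = 2$, $b = 0$ they are 5 and 7, so the relevant calls are `ramified(3, 2, 2)`, `ramified(3, 2, 3)`, `ramified(5, 7, 5)` and `ramified(5, 7, 7)`.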
Remark 4.3. For any nonzero rational functions $f(t), g(t) \in \mathbb{Q}(t)$, Tsen's theorem implies 
that the quaternion algebra $H_{f(t),g(t)}$ over $\mathbb{Q}(t)$ is split by $\overline{\mathbb{Q}}(t)$ and hence by $k(t)$ for some 
number field $k$, and hence by $\mathbb{Q}_p(t)$ for any prime $p$ splitting in $k$; for such $p$, we have that 
$H_{f(a),g(a)} \otimes \mathbb{Q}_p$ is split for all $a \in \mathbb{Q}$ such that $f(a)$ and $g(a)$ are defined and nonzero. 

Proof of Theorem 4.1. Starting with the formula in Theorem 3.1, we first replace $a$ and $b$ 
by $a^2 + b^2 + 1$ and $a^2 + a + 1 + b^2$, respectively, each time they appear in the polynomial 
equation: this renders the $a_i$ and $b_i$ unnecessary; and the resulting formula still defines Z, 
by Lemma 4.2. Next we solve $2x_1 + 2y_1 + n - t = 0$ for $y_1$ to eliminate $y_1$ (and we clear 
denominators). Finally, the quantifier for $n$ is unnecessary because $n$ takes on only finitely 
many values. The resulting formula is 

$$(\forall a, b)(\exists x_1, x_2, x_3, x_4, y_2, y_3, y_4)$$

$$\begin{aligned}
&\bigl[x_1^2 - (a^2 + b^2 + 1) x_2^2 - (a^2 + a + 1 + b^2) x_3^2 + (a^2 + b^2 + 1)(a^2 + a + 1 + b^2) x_4^2 - 1\bigr]^2 \\
&\quad + \prod_{n=0}^{2309} \bigl[(n - t - 2x_1)^2 - 4(a^2 + b^2 + 1) y_2^2 - 4(a^2 + a + 1 + b^2) y_3^2 \\
&\qquad\qquad + 4(a^2 + b^2 + 1)(a^2 + a + 1 + b^2) y_4^2 - 4\bigr]^2 = 0. \qquad \square
\end{aligned}$$
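
The elimination of $y_1$ can be sanity-checked with exact rational arithmetic. Solving $2x_1 + 2y_1 + n - t = 0$ gives $2y_1 = t - n - 2x_1$, so four times the second norm condition has leading term $(t - n - 2x_1)^2$; this agrees with the printed term $(n - t - 2x_1)^2$ after the change of variable $x_1 \mapsto -x_1$, which leaves the first bracket unchanged because $x_1$ occurs there only as $x_1^2$. The sketch below tests that identity at random rational points; the function names and the sampling ranges are arbitrary illustrative choices.

```python
import random
from fractions import Fraction


def norm_form(A, B, y1, y2, y3, y4):
    """Reduced norm of y1 + y2*i + y3*j + y4*ij in H_{A,B}."""
    return y1 * y1 - A * y2 * y2 - B * y3 * y3 + A * B * y4 * y4


def cleared_factor(A, B, t, n, x1, y2, y3, y4):
    """One factor of the product in the final formula, before squaring."""
    return ((n - t - 2 * x1) ** 2 - 4 * A * y2 * y2 - 4 * B * y3 * y3
            + 4 * A * B * y4 * y4 - 4)


def check_elimination(trials: int = 200, seed: int = 0) -> bool:
    """Compare 4*(norm - 1), with y1 eliminated, against the printed factor
    evaluated at -x1, on randomly chosen rational inputs."""
    rng = random.Random(seed)

    def rnd() -> Fraction:
        return Fraction(rng.randint(-9, 9), rng.randint(1, 9))

    for _ in range(trials):
        a, b, t, x1, y2, y3, y4 = (rnd() for _ in range(7))
        n = rng.randrange(2310)
        A = a * a + b * b + 1
        B = a * a + a + 1 + b * b
        y1 = (t - n - 2 * x1) / 2  # solve 2*x1 + 2*y1 + n - t = 0 for y1
        lhs = 4 * (norm_form(A, B, y1, y2, y3, y4) - 1)
        if lhs != cleared_factor(A, B, t, n, -x1, y2, y3, y4):
            return False
    return True
```

A passing check is only numerical evidence for the polynomial identity, not a proof, although here the identity is also immediate by expanding both sides.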

We can also give a new proof of the following result, which was first proved by G. Cornelissen 
and A. Shlapentokh. 

Theorem 4.4 (Cornelissen and Shlapentokh). For every $\varepsilon > 0$, there is a set $R$ of primes 
of natural density at least $1 - \varepsilon$ such that Z is definable in $\mathbb{Z}[R^{-1}]$ using a $\Pi_2^+$-formula with 
just one universal quantifier (instead of two). 

Proof. Given $\varepsilon$, choose a positive integer $m$ such that $2^{-m} < \varepsilon$, and let $B$ be the set of the 
first $m$ primes. Let $R$ be the set of $p \in \mathcal{P}$ that fail to split in $\mathbb{Q}(\sqrt{b})$ for at least one $b \in B$. 
The density of $R$ equals $1 - 2^{-m} > 1 - \varepsilon$. 

For fixed $b > 0$, the set $\bigcup_{a \in \mathbb{Z}[R^{-1}]} \Delta_{a,b}$ equals the set of primes that fail to split in $\mathbb{Q}(\sqrt{b})$, 
so $\bigcap_{a \in \mathbb{Z}[R^{-1}]} T_{a,b} = \mathbb{Z}[S_b^{-1}]$, where $S_b$ is the set of primes that split in $\mathbb{Q}(\sqrt{b})$. Thus 

$$\mathbb{Z}[R^{-1}] \cap \bigcap_{b \in B} \bigcap_{a \in \mathbb{Z}[R^{-1}]} T_{a,b} = \mathbb{Z}.$$

The set on the left is definable in $\mathbb{Z}[R^{-1}]$ by a $\Pi_2^+$-formula, since positive existential formulas 
over Q may be modeled by equivalent positive existential formulas over $\mathbb{Z}[R^{-1}]$. Moreover, 
only one universal quantifier (for $a$) is needed, since $b$ ranges over only finitely many values. 

□ 
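
The density claim in this proof can be illustrated numerically. The sketch below picks the smallest $m$ with $2^{-m} < \varepsilon$ and estimates the density of $R$ among the primes up to a bound, using Euler's criterion to decide whether $b$ is a square modulo $p$. It makes simplifying assumptions that are not in the text: the prime 2 and the primes in $B$ are left out of the sample (a finite set, which does not affect the density), $m$ is at most 25, and the bound is an arbitrary cutoff. The function names are illustrative.

```python
def primes_up_to(bound: int) -> list[int]:
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (bound + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(bound ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, bound + 1, i)))
    return [i for i in range(bound + 1) if sieve[i]]


def smallest_m(eps: float) -> int:
    """Smallest positive integer m with 2^(-m) < eps."""
    m = 1
    while 2.0 ** -m >= eps:
        m += 1
    return m


def empirical_density_of_R(m: int, bound: int) -> float:
    """Fraction of sampled primes p <= bound at which some b among the
    first m primes is a non-square mod p (so p does not split in Q(sqrt b))."""
    B = primes_up_to(100)[:m]  # first m primes, assuming m <= 25
    sample = [p for p in primes_up_to(bound) if p != 2 and p not in B]
    in_R = [p for p in sample
            if any(pow(b, (p - 1) // 2, p) != 1 for b in B)]
    return len(in_R) / len(sample)
```

For $m = 2$, so $B = \{2, 3\}$, the primes outside $R$ are those congruent to $\pm 1$ modulo 24, and the estimate should be close to $1 - 2^{-2} = 0.75$ for a moderately large bound.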

Remark 4.5. The proof of Theorem 4.4 shows also that for every $\varepsilon > 0$, there is a subset 
$S \subset \mathcal{P}$ of density less than $\varepsilon$ (namely, $\bigcap_{b \in B} S_b$) such that $\mathbb{Z}[S^{-1}]$ is definable in Q by a 
$\Pi_2^+$-formula with just one universal quantifier. 

5. Defining rings of integers 

Theorem 5.1. There is a $\Pi_2^+$-formula that in any number field $k$ defines its ring of integers. 

Proof. Let $\zeta_{23}$ be a primitive 23rd root of 1 in an algebraic closure of $k$. Let $f(x) \in \mathbb{Z}[x]$ be 
the minimal polynomial of $\zeta_{23} + \zeta_{23}^{-1}$ over Q. 

Case 1: $k$ contains a zero of $f$ and a zero of $x^2 + 23$. Then $k \supseteq \mathbb{Q}(\zeta_{23})$, so the residue field 
at every prime of $k$ not above 23 contains a primitive 23rd root of 1. In particular, every 
residue field is an $\mathbb{F}_q$ with $q > 11$, so Lemma 2.3 always applies. Also $k$ has no real places. 
Thus the argument of Section 2 shows that for any $a, b \in k^\times$, the analogously defined $T_{a,b}$ 
(without the $n$) equals the set of elements of $k$ that are integral at every prime ramifying in 
$H_{a,b}$. We can require $a, b \in k^\times$ by adding an equation $abc - 1 = 0$. So $\mathcal{O}_k$ is defined in $k$ by 
the following formula $\Phi$: 

$$(\forall a, b)(\exists c, x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4)$$

$$(abc - 1)^2 + (x_1^2 - a x_2^2 - b x_3^2 + ab x_4^2 - 1)^2 + (y_1^2 - a y_2^2 - b y_3^2 + ab y_4^2 - 1)^2 + (2x_1 + 2y_1 - t)^2 = 0.$$

Case 2: $k$ contains a zero of $f$ but not a zero of $x^2 + 23$. By using Case 1 and a 
$\Pi_2^+$-analogue of the "Going up and then down" method [Shl07, Lemma 2.1.17] (i.e., modeling a 
formula over $k' := k[x]/(x^2 + 23)$ by a formula over $k$, by restriction of scalars), we find a 
$\Pi_2^+$-formula $\Psi$ defining $\mathcal{O}_k$ in $k$. 

Cases 1 and 2: For any number field $k$ containing a zero of $f$, we use 

$$\bigl(((\exists u)\ u^2 + 23 = 0) \wedge \Phi\bigr) \vee \bigl(((\forall v)(\exists w)\ w(v^2 + 23) = 1) \wedge \Psi\bigr)$$

which, when written in positive prenex form, is a $\Pi_2^+$-formula, by [CZ06, Lemma 1.20.1]. 

The same approach of dividing into two cases lets us generalize to include the case where 
$k$ does not contain a zero of $f$: in this case, $f$ is irreducible over $k$, since $\mathbb{Q}(\zeta_{23} + \zeta_{23}^{-1})$ is 
abelian of prime degree over Q. □ 